37 research outputs found
Identifying beneficial task relations for multi-task learning in deep neural networks
Multi-task learning (MTL) in deep neural networks for NLP has recently
received increasing interest due to some compelling benefits, including its
potential to efficiently regularize models and to reduce the need for labeled
data. While it has brought significant improvements in a number of NLP tasks,
mixed results have been reported, and little is known about the conditions
under which MTL leads to gains in NLP. This paper sheds light on the specific
task relations that can lead to gains from MTL models over single-task setups.

Comment: Accepted for publication at EACL 201
Latent Multi-task Architecture Learning
Multi-task learning (MTL) allows deep neural networks to learn from related
tasks by sharing parameters with other networks. In practice, however, MTL
involves searching an enormous space of possible parameter sharing
architectures to find (a) the layers or subspaces that benefit from sharing,
(b) the appropriate amount of sharing, and (c) the appropriate relative weights
of the different task losses. Recent work has addressed each of the above
problems in isolation. In this work we present an approach that learns a latent
multi-task architecture that jointly addresses (a)--(c). We present experiments
on synthetic data and data from OntoNotes 5.0, including four different tasks
and seven different domains. Our extension consistently outperforms previous
approaches to learning latent architectures for multi-task problems and
achieves up to 15% average error reductions over common approaches to MTL.

Comment: To appear in Proceedings of AAAI 201
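The two abstracts above both build on hard parameter sharing: a shared encoder feeds task-specific heads, and the per-task losses are combined with relative weights (point (c) in the abstract). A minimal sketch of that setup, using NumPy and purely illustrative dimensions and weight parametrization (none of this is the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, not taken from the paper.
d_in, d_hidden, n_tasks = 8, 16, 3

# Shared encoder parameters (hard parameter sharing across tasks).
W_shared = rng.normal(scale=0.1, size=(d_in, d_hidden))

# One linear head per task (task-specific parameters).
W_heads = [rng.normal(scale=0.1, size=(d_hidden, 1)) for _ in range(n_tasks)]

# Learnable relative task-loss weights, kept on the simplex via softmax.
alpha = np.zeros(n_tasks)

def task_losses(x, targets):
    """Mean squared error of each task head on the shared representation."""
    h = np.tanh(x @ W_shared)  # shared layer, used by every task
    return np.array([np.mean((h @ W_heads[t] - targets[t]) ** 2)
                     for t in range(n_tasks)])

def combined_loss(x, targets):
    """Weighted sum of per-task losses; the weights sum to one."""
    w = np.exp(alpha) / np.exp(alpha).sum()
    return float(w @ task_losses(x, targets))

x = rng.normal(size=(32, d_in))
targets = [rng.normal(size=(32, 1)) for _ in range(n_tasks)]
loss = combined_loss(x, targets)
```

In this sketch only the loss weights are explicit; the papers above additionally learn *where* to share (which layers or subspaces) rather than fixing the shared encoder up front.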
Named entity tagging a very large unbalanced corpus: training and evaluating NE classifiers
We describe a systematic and application-oriented approach to training and evaluating named entity recognition and classification (NERC) systems, the purpose of which is to identify an optimal system and to train an optimal model for named entity tagging DeReKo, a very large general-purpose corpus of contemporary German (Kupietz et al., 2010). DeReKo's strong dispersion with respect to genre, register and time forces us to base our decision for a specific NERC system on an evaluation performed on a representative sample of DeReKo instead of performance figures that have been reported for the individual NERC systems when evaluated on more uniform and less diverse data. We create and manually annotate such a representative sample as evaluation data for three different NERC systems, for each of which various models are trained on multiple training data sets. The proposed sampling method can be viewed as a generally applicable method for sampling evaluation data from an unbalanced target corpus for any sort of natural language processing task.
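The core idea of the sampling step is that the evaluation sample should mirror the corpus's dispersion across strata such as genre and register. A minimal stratified-sampling sketch, with hypothetical document fields (`genre`, `id`) standing in for whatever metadata the target corpus provides; this is not the paper's exact procedure:

```python
import random
from collections import defaultdict

def stratified_sample(documents, n_total, key=lambda d: d["genre"], seed=42):
    """Draw an evaluation sample whose strata (e.g. genres) mirror the corpus."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for doc in documents:
        strata[key(doc)].append(doc)
    sample = []
    for docs in strata.values():
        # Proportional allocation, with at least one document per stratum
        # so that rare genres are not dropped entirely.
        k = max(1, round(n_total * len(docs) / len(documents)))
        sample.extend(rng.sample(docs, min(k, len(docs))))
    return sample

# Toy unbalanced corpus: 90% news, 10% fiction.
corpus = ([{"genre": "news", "id": i} for i in range(900)]
          + [{"genre": "fiction", "id": i} for i in range(100)])
sample = stratified_sample(corpus, 50)
```

With proportional allocation the 50-document sample keeps the 90/10 genre split, so system rankings measured on it are less biased by the dominant genre than rankings taken from a uniform benchmark.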
Disembodied Machine Learning: On the Illusion of Objectivity in NLP
Machine Learning seeks to identify and encode bodies of knowledge within
provided datasets. However, data encodes subjective content, which determines
the possible outcomes of the models trained on it. Because such subjectivity
enables marginalisation of parts of society, it is termed (social) 'bias' and
sought to be removed. In this paper, we contextualise this discourse of bias in
the ML community against the subjective choices in the development process.
Through a consideration of how choices in data and model development construct
subjectivity, or biases that are represented in a model, we argue that
addressing and mitigating biases is near-impossible. This is because both data
and ML models are objects for which meaning is made in each step of the
development pipeline, from data selection over annotation to model training and
analysis. Accordingly, we find the prevalent discourse of bias limiting in its
ability to address social marginalisation. We recommend being conscientious of
this, and accepting that de-biasing methods correct for only a fraction of
biases.

Comment: In review
KoralQuery - a General Corpus Query Protocol
The task-oriented and format-driven development of corpus query systems has led to the creation of numerous corpus query languages (QLs) that vary strongly in expressiveness and syntax. This is a severe impediment to the interoperability of corpus analysis systems, which lack a common protocol. In this paper, we present KoralQuery, a JSON-LD based general corpus query protocol, aiming to be independent of particular QLs, tasks and corpus formats. In addition to describing the system of types and operations that KoralQuery is built on, we exemplify the representation of corpus queries in the serialized format and illustrate use cases in the KorAP project.
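A protocol like this represents a query as a typed JSON-LD object tree rather than as QL-specific syntax. The sketch below builds such a tree in Python; the type and field names (`koral:term`, `koral:group`, `operation`, `operands`) and the context URL are illustrative assumptions, not the definitive KoralQuery schema:

```python
import json

def term_query(layer, key, match="eq"):
    """A hypothetical leaf query: match `key` on an annotation `layer`."""
    return {"@type": "koral:term", "layer": layer, "key": key, "match": match}

def sequence(*operands):
    """Combine sub-queries into an ordered-sequence group."""
    return {"@type": "koral:group", "operation": "sequence",
            "operands": list(operands)}

# A QL-independent representation of "a determiner followed by a noun".
query = {
    "@context": "http://example.org/koralquery/context.jsonld",  # placeholder
    "query": sequence(term_query("pos", "DET"), term_query("pos", "NN")),
}
serialized = json.dumps(query, indent=2)
```

The point of the design is that front-ends for different QLs can each compile into this one serialized format, and back-ends only need to interpret the typed tree.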